Fast & Confident Probabilistic Categorisation

Author

Cyril Goutte

Abstract

We describe NRC’s submission to the Anomaly Detection/Text Mining competition organised at the Text Mining Workshop 2007. This submission relies on a straightforward implementation of the probabilistic categoriser described in [4]. The categoriser is adapted to handle multiple labelling, and a piecewise-linear confidence estimation layer is added to estimate the labelling confidence. This technique achieves a score of 1.689 on the test data.

Similar resources

A Probabilistic Model for Fast and Confident*

Permission is granted to quote short excerpts and to reproduce figures and tables from this report, provided that the source of such material is fully acknowledged.

A Probabilistic Neighbourhood Translation Approach for Non-standard Text Categorisation

The need for non-standard text categorisation, i.e. based on some subtle criterion other than topics, may arise in various circumstances. In this study, we consider written responses to a standardised psychometric test for determining the personality trait of human subjects. A number of state-of-the-art text classifiers that having been very successful in standard topic-based classification pro...

Probabilistic Models for Hierarchical Clustering and Categorisation: Applications in the Information Society

We propose a new hierarchical generative model for textual data, where words may be generated by topic-specific distributions at any level in the hierarchy. This model is naturally well-suited to clustering documents in preset or automatically generated hierarchies, as well as categorising new documents in an existing hierarchy. Furthermore, we present a series of applications that can benefit...

Now at Department of Computer Science, Queen Mary & Westfield College, University of London.

The automatic categorisation of web documents is becoming crucial for organising the huge amount of information available in the Internet. We are facing a new challenge due to the fact that web documents have a rich structure and are highly heterogeneous. Two ways to respond to this challenge are (1) using a representation of the content of web documents that captures these two characteristics ...

Indexing for Fast Categorisation

Publication date: 2007
